Entities as topic labels: Improving topic interpretability and evaluability combining Entity Linking and Labeled LDA

Authors

  • Federico Nanni
  • Pablo Ruiz

Similar resources

PAYMA: A Tagged Corpus of Persian Named Entities

The goal in the named entity recognition task is to classify proper nouns of a piece of text into classes such as person, location, and organization. Named entity recognition is an important preprocessing step in many natural language processing tasks such as question-answering and summarization. Although many research studies have been conducted in this area in English and the state-of-the-art...

A probabilistic topic model based on local word relationships in overlapping windows

A probabilistic topic model assumes that documents are generated through a process involving topics and then tries to reverse this process, given the documents and extract topics. A topic is usually assumed to be a distribution over words. LDA is one of the first and most popular topic models introduced so far. In the document generation process assumed by LDA, each document is a distribution o...

The Polylingual Labeled Topic Model

In this paper, we present the Polylingual Labeled Topic Model, a model which combines the characteristics of the existing Polylingual Topic Model and Labeled LDA. The model accounts for multiple languages with separate topic distributions for each language while restricting the permitted topics of a document to a set of predefined labels. We explore the properties of the model in a two-language...

Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation

Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...

The THU Summarization Systems at TAC 2010

The TAC 2010 Guided Summarization task requires participants to generate coherent summaries with the guidance of predefined categories and aspects. In this paper, we present our two extractive summarization systems. In the first system, we employ a topic model Labeled LDA to model the aspects. The correspondence between the aspects and the topics in Labeled LDA is established through identifyin...

Journal title:
  • CoRR

Volume abs/1604.07809

Publication date 2016